Towards Accountable AI: Hybrid Human-Machine Analyses for Characterizing System Failure
As machine learning systems move from computer-science laboratories into the
open world, their accountability becomes a high-priority problem.
Accountability requires deep understanding of system behavior and its failures.
Current evaluation methods such as single-score error metrics and confusion
matrices provide aggregate views of system performance that hide important
shortcomings. Understanding details about failures is important for identifying
pathways for refinement, communicating the reliability of systems in different
settings, and specifying appropriate human oversight and engagement.
Characterization of failures and shortcomings is particularly complex for
systems composed of multiple machine learned components. For such systems,
existing evaluation methods have limited expressiveness in describing and
explaining the relationship among input content, the internal states of system
components, and final output quality. We present Pandora, a set of hybrid
human-machine methods and tools for describing and explaining system failures.
Pandora leverages both human and system-generated observations to summarize
conditions of system malfunction with respect to the input content and system
architecture. We share results of a case study with a machine learning pipeline
for image captioning that show how detailed performance views can be beneficial
for analysis and debugging.
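As a rough, hypothetical sketch (not Pandora's actual implementation or schema) of how human-provided input tags and machine-generated component states might be joined to summarize conditions under which a captioning pipeline fails, consider the following; all field and function names are illustrative assumptions.

    # Illustrative sketch: group evaluation records by (input tag, component state)
    # and report the failure rate of each condition. Field names are assumptions.
    from collections import defaultdict

    records = [
        {"input_tags": ("crowded scene",), "component_states": {"detector": "low_confidence"}, "caption_ok": False},
        {"input_tags": ("single object",), "component_states": {"detector": "high_confidence"}, "caption_ok": True},
        {"input_tags": ("crowded scene",), "component_states": {"detector": "low_confidence"}, "caption_ok": False},
    ]

    def failure_rates_by_condition(records):
        """Return the failure rate for each (input tag, component=state) pair."""
        counts = defaultdict(lambda: [0, 0])  # condition -> [failures, total]
        for record in records:
            for tag in record["input_tags"]:
                for component, state in record["component_states"].items():
                    key = (tag, f"{component}={state}")
                    counts[key][0] += 0 if record["caption_ok"] else 1
                    counts[key][1] += 1
        return {key: failures / total for key, (failures, total) in counts.items()}

    for condition, rate in failure_rates_by_condition(records).items():
        print(condition, f"failure rate = {rate:.0%}")

Such conditional views (e.g., the failure rate when the detector is low-confidence on crowded scenes) are one way to move beyond single aggregate scores.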
Social Biases through the Text-to-Image Generation Lens
Text-to-Image (T2I) generation is enabling new applications that support
creators, designers, and general end users of productivity software by
generating illustrative content with high photorealism starting from a given
descriptive text as a prompt. Such models, however, are trained on massive
amounts of web data, which raises the risk that harmful biases may leak into
the generation process itself. In this paper, we take a
multi-dimensional approach to studying and quantifying common social biases as
reflected in the generated images, by focusing on how occupations, personality
traits, and everyday situations are depicted across representations of
(perceived) gender, age, race, and geographical location. Through an extensive
set of both automated and human evaluation experiments we present findings for
two popular T2I models: DALLE-v2 and Stable Diffusion. Our results reveal
severe occupational biases in both models, with neutral prompts largely
excluding certain groups of people from the results. Such biases can be
mitigated by increasing the specificity of the prompt, although prompt-based
mitigation does not address discrepancies in image quality or other uses of
the model or its representations in other scenarios. Further, we
observe personality traits being associated with only a limited set of people
at the intersection of race, gender, and age. Finally, an analysis of
geographical location representations on everyday situations (e.g., park, food,
weddings) shows that for most situations, images generated from default
location-neutral prompts are most similar to images generated for the United
States and Germany.
Mitigating Spurious Correlations in Multi-modal Models during Fine-tuning
Spurious correlations that degrade model generalization or lead the model to
be right for the wrong reasons are one of the main robustness concerns for
real-world deployments. However, mitigating these correlations during
pre-training for large-scale models can be costly and impractical, particularly
for those without access to high-performance computing resources. This paper
proposes a novel approach to address spurious correlations during fine-tuning
for a given domain of interest. With a focus on multi-modal models (e.g.,
CLIP), the proposed method leverages different modalities in these models to
detect and explicitly set apart spurious attributes from the affected class,
achieved through a multi-modal contrastive loss function that expresses
spurious relationships through language. Our experimental results and in-depth
visualizations on CLIP show that such an intervention can effectively i)
improve the model's accuracy when spurious attributes are not present, and ii)
direct the model's activation maps towards the actual class rather than the
spurious attribute when present. In particular, on the Waterbirds dataset, our
algorithm achieved a worst-group accuracy 23% higher than ERM on CLIP with a
ResNet-50 backbone, and 32% higher on CLIP with a ViT backbone, while
maintaining the same average accuracy as ERM.
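As a rough illustration of the idea, the sketch below uses CLIP's text encoder to express a spurious relationship in language and contrasts image embeddings against the class description versus the spurious-attribute description during fine-tuning; this is an assumption-laden simplification rather than the paper's exact loss, and prompts such as "a photo of a waterbird" are purely illustrative.

    # Minimal sketch (assumptions, not the paper's exact objective): pull image
    # embeddings toward the text description of their class and away from the
    # language description of the spurious attribute.
    import torch
    import torch.nn.functional as F
    import clip  # OpenAI CLIP package, assumed installed

    device = "cuda" if torch.cuda.is_available() else "cpu"
    model, preprocess = clip.load("RN50", device=device)

    def spurious_contrastive_loss(image_features, class_text, spurious_text, temperature=0.07):
        """Cross-entropy over similarities to [class description, spurious description]."""
        with torch.no_grad():
            text_tokens = clip.tokenize([class_text, spurious_text]).to(device)
            text_features = F.normalize(model.encode_text(text_tokens).float(), dim=-1)
        image_features = F.normalize(image_features.float(), dim=-1)
        logits = image_features @ text_features.t() / temperature
        # Index 0 (the class description) is the positive target for every image.
        targets = torch.zeros(logits.size(0), dtype=torch.long, device=device)
        return F.cross_entropy(logits, targets)

    # Hypothetical usage on a Waterbirds-style batch of preprocessed images:
    # image_features = model.encode_image(image_batch.to(device))
    # loss = spurious_contrastive_loss(image_features,
    #                                  "a photo of a waterbird",
    #                                  "a photo of a water background")
    # loss.backward()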
Is the Most Accurate AI the Best Teammate? Optimizing AI for Teamwork
AI practitioners typically strive to develop the most accurate systems,
making an implicit assumption that the AI system will function autonomously.
However, in practice, AI systems are often used to provide advice to people in
domains ranging from criminal justice and finance to healthcare. In such
AI-advised decision making, humans and machines form a team, where the human is
responsible for making final decisions. But is the most accurate AI the best
teammate? We argue "No" -- predictable performance may be worth a slight
sacrifice in AI accuracy. Instead, we argue that AI systems should be trained
in a human-centered manner, directly optimized for team performance. We study
this proposal for a specific type of human-AI teaming, where the human overseer
chooses to either accept the AI recommendation or solve the task themselves. To
optimize the team performance for this setting we maximize the team's expected
utility, expressed in terms of the quality of the final decision, cost of
verifying, and individual accuracies of people and machines. Our experiments
with linear and non-linear models on real-world, high-stakes datasets show that
the most accurate AI may not lead to the highest team performance and demonstrate the
benefit of modeling teamwork during training through improvements in expected
team utility across datasets, considering parameters such as human skill and
the cost of mistakes. We discuss the shortcomings of current optimization
approaches beyond well-studied loss functions such as log-loss, and encourage
future work on AI optimization problems motivated by human-AI collaboration.
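To make the expected-utility framing concrete, here is a small, hypothetical sketch of a team-utility calculation for the accept-or-solve setting described above; the parameter names and values (accept_prob, mistake_cost, solve_cost) are illustrative assumptions rather than the paper's notation.

    # Sketch: expected utility of one AI-advised decision when the human either
    # accepts the AI recommendation or solves the task themselves. All names and
    # numbers are illustrative assumptions.
    def expected_team_utility(ai_accuracy, human_accuracy, accept_prob,
                              correct_reward=1.0, mistake_cost=5.0, solve_cost=0.2):
        def outcome(p_correct):
            # Reward for a correct final decision, penalty for a mistake.
            return p_correct * correct_reward - (1.0 - p_correct) * mistake_cost
        accept_term = accept_prob * outcome(ai_accuracy)
        solve_term = (1.0 - accept_prob) * (outcome(human_accuracy) - solve_cost)
        return accept_term + solve_term

    # A slightly less accurate AI can yield higher team utility if its behavior
    # lets the human accept its recommendations more often:
    print(expected_team_utility(ai_accuracy=0.95, human_accuracy=0.90, accept_prob=0.5))  # ~0.45
    print(expected_team_utility(ai_accuracy=0.92, human_accuracy=0.90, accept_prob=0.9))  # ~0.49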